Monitoring groundwater storage at actionable spatial scales remains a major challenge in regions experiencing chronic groundwater stress. This study presents a machine learning–based framework to downscale GRACE-derived Terrestrial Water Storage Anomalies (TWSA) from 111 km to 5 km over West Bengal, India, enabling improved characterization of spatial and seasonal groundwater variability. Three algorithms (XGBoost, Random Forest, and Support Vector Regression) were evaluated using multivariate hydroclimatic and vegetation predictors. Among them, XGBoost performed best (R^2 = 0.91, NSE = 0.91) and most effectively captured nonlinear groundwater dynamics.
The 5-km downscaled product reveals pronounced sub-regional heterogeneity and evolving groundwater depletion hotspots that are not detectable in native GRACE data. Seasonal analyses show weakening post-monsoon recovery and an expansion of groundwater stress from traditionally affected southwestern districts toward broader southern and western regions during 2003–2023. These results highlight a growing imbalance between recharge and extraction driven by hydroclimatic variability and anthropogenic pressure. By overcoming the spatial limitations of GRACE, this approach provides high-resolution insights critical for groundwater monitoring, management, and policy planning. The proposed framework is transferable to other data-scarce and hydrogeologically complex regions, supporting more informed groundwater sustainability assessments.
Introduction
Groundwater is a vital resource in West Bengal but is being rapidly depleted due to population growth, urbanization, and intensive irrigation. Traditional in situ measurements are accurate but limited in coverage, while GRACE satellite data capture large-scale water storage changes only at a coarse resolution (~111 km), making them unsuitable for local analysis.
To overcome this limitation, the study applies three machine learning techniques (XGBoost, Random Forest, and Support Vector Regression) to downscale GRACE-derived Terrestrial Water Storage Anomalies to a finer 5 km resolution. This is achieved by integrating multiple high-resolution datasets, including precipitation, evaporation, and runoff (ERA5-Land), groundwater storage (GLDAS), evapotranspiration (TerraClimate), and vegetation indices (MODIS NDVI).
The study area, West Bengal, shows strong hydrological and climatic variability, ranging from Himalayan terrain to deltaic plains, and its groundwater quality is affected by arsenic, fluoride, and salinity contamination.
The methodology involves preprocessing and aligning multi-source datasets, training ML models at coarse resolution, and applying learned relationships to generate detailed groundwater maps across 3,237 grid cells over 2003–2023.
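The train-at-coarse, predict-at-fine workflow described above can be illustrated with a minimal sketch. It is not the study's implementation: the 4 x 4 grid, the TWSA values, and the helper names (`block_mean`, `fit_line`, `nse`) are hypothetical, and a one-predictor linear fit stands in for the XGBoost, Random Forest, and Support Vector Regression models so that the example needs only the Python standard library.

```python
from statistics import mean

def block_mean(grid, factor):
    """Aggregate a square grid (list of rows) by averaging factor x factor blocks."""
    n = len(grid) // factor
    return [
        [
            mean([grid[i * factor + di][j * factor + dj]
                  for di in range(factor) for dj in range(factor)])
            for j in range(n)
        ]
        for i in range(n)
    ]

def flatten(grid):
    """Turn a list of rows into a flat list of cell values."""
    return [v for row in grid for v in row]

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mo = mean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return 1 - sse / sum((o - mo) ** 2 for o in observed)

# Synthetic fine-resolution predictor (a stand-in for, e.g., precipitation).
fine_predictor = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [5.0, 6.0, 7.0, 8.0],
    [6.0, 7.0, 8.0, 9.0],
]
# Synthetic coarse-resolution TWSA, one value per 2 x 2 block of fine cells.
coarse_twsa = [[3.0, 7.0], [11.0, 15.0]]

# 1) Aggregate the predictor to the coarse grid so it aligns with the target.
coarse_predictor = block_mean(fine_predictor, 2)
# 2) Learn the predictor-to-TWSA relationship at coarse resolution.
slope, intercept = fit_line(flatten(coarse_predictor), flatten(coarse_twsa))
# 3) Apply the learned relationship to the fine-resolution predictor.
fine_twsa = [[slope * v + intercept for v in row] for row in fine_predictor]
# 4) Consistency check: re-aggregate the fine map and score it against the input.
score = nse(flatten(coarse_twsa), flatten(block_mean(fine_twsa, 2)))

print(f"slope={slope:.2f}, intercept={intercept:.2f}, NSE={score:.2f}")
```

In a real application the single predictor would be replaced by the full predictor set, the linear fit by one of the three regressors, and the score would be computed on held-out data rather than on the training cells.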
Conclusion
This study presented a data-driven framework for improving the spatial resolution of GRACE-derived groundwater storage anomalies over West Bengal, a region under chronic groundwater stress, from 111 km to 5 km. By combining GRACE TWSA with multivariate satellite-based hydrological predictors, the framework provides a characterization of groundwater variability that is not available at the native GRACE resolution. Among the algorithms evaluated, XGBoost achieved the best predictive performance and effectively represented nonlinear behavior in groundwater variability, and is therefore recommended. The 5-km downscaled maps depict seasonal variability in groundwater stress from 2003 to 2023. Groundwater stress increased in magnitude and spatial extent, spreading from isolated hotspots in the southwest to southeastern and western parts of the state. Inter-seasonal variations also reveal a decline in post-monsoon recharge in recent years, reflecting a growing mismatch between recharge and withdrawals. These dynamics are consistent with increased agricultural abstraction, precipitation variability, and rising urban-industrial water demand. The high-resolution maps make it possible to delineate areas of severe groundwater stress at the sub-district level and can serve as inputs for decision-making. Methodologically, this research highlights the capability of ML-based downscaling to extend available satellite data, contributing to better groundwater characterization in data-scarce regions. The framework can be applied to other basins and could be improved with additional hydrological and water-level data and with data from future satellite missions, supporting groundwater management as pressures intensify under climate change.